An Exact Dynamic Programming Solution for a Decentralized Two-Player Markov Decision Process

نویسندگان

  • Jeff Wu
  • Sanjay Lall
چکیده

We present an exact dynamic programming solution for a finite-horizon decentralized two-player Markov decision process, where player 1 only has access to its own states, while player 2 has access to both player’s states but cannot affect player 1’s states. The solution is obtained by solving several centralized partially-observable Markov decision processes. We then conclude with several computational examples.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Application of the ABS LX Algorithm to Multiple Sequence Alignment

We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. The important features of our work include the facility of alignment of multiple sequences simultaneously and no limit for the length of the sequences. Our goal here is to ...

متن کامل

Piecewise Linear Dynamic Programming for Constrained POMDPs

We describe an exact dynamic programming update for constrained partially observable Markov decision processes (CPOMDPs). State-of-the-art exact solution of unconstrained POMDPs relies on implicit enumeration of the vectors in the piecewise linear value function, and pruning operations to obtain a minimal representation of the updated value function. In dynamic programming for CPOMDPs, each vec...

متن کامل

Modelling and Decision-making on Deteriorating Production Systems using Stochastic Dynamic Programming Approach

This study aimed at presenting a method for formulating optimal production, repair and replacement policies. The system was based on the production rate of defective parts and machine repairs and then was set up to optimize maintenance activities and related costs. The machine is either repaired or replaced. The machine is changed completely in the replacement process, but the productio...

متن کامل

Towards Computing Optimal Policies for Decentralized POMDPs

The problem of deriving joint policies for a group of agents that maximze some joint reward function can be modelled as a decentralized partially observable Markov decision process (DEC-POMDP). Significant algorithms have been developed for single agent POMDPs however, with a few exceptions, effective algorithms for deriving policies for decentralized POMDPS have not been developed. As a first ...

متن کامل

An Investigation into Mathematical Programming for Finite Horizon Decentralized POMDPs

Decentralized planning in uncertain environments is a complex task generally dealt with by using a decision-theoretic approach, mainly through the framework of Decentralized Partially Observable Markov Decision Processes (DEC-POMDPs). Although DEC-POMDPS are a general and powerful modeling tool, solving them is a task with an overwhelming complexity that can be doubly exponential. In this paper...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010